Goto

Collaborating Authors

 information bit



A Study of Neural Polar Decoders for Communication

Hirsch, Rom, Aharoni, Ziv, Pfister, Henry D., Permuter, Haim H.

arXiv.org Artificial Intelligence

Abstract--In this paper, we adapt and analyze Neural Polar Decoders (NPDs) for end-to-end communication systems. While prior work demonstrated the effectiveness of NPDs on synthetic channels, this study extends the NPD to real-world communication systems. The NPD was adapted to complete OFDM and single-carrier communication systems. T o satisfy practical system requirements, the NPD is extended to support any code length via rate matching, higher-order modulations, and robustness across diverse channel conditions. The NPD operates directly on channels with memory, exploiting their structure to achieve higher data rates without requiring pilots and a cyclic prefix. Although NPD entails higher computational complexity than the standard 5G polar decoder, its neural network architecture enables an efficient representation of channel statistics, resulting in manageable complexity suitable for practical systems. Experimental results over 5G channels demonstrate that the NPD consistently outperforms the 5G polar decoder in terms of BER, BLER, and throughput. These improvements are particularly significant for low-rate and short-block configurations, which are prevalent in 5G control channels. Furthermore, NPDs applied to single-carrier systems offer performance comparable to OFDM with lower PAPR, enabling effective single-carrier transmission over 5G channels. Polar codes, introduced by Arıkan in 2009 [1], are the first class of codes proven to achieve the capacity of symmetric binary-input discrete memoryless channels (B-DMCs) under low-complexity successive cancellation (SC) decoding. In 5G, polar codes are primarily used for control channels, where high performance is required with a low rate and short code length. Their inclusion in the 5G New Radio (NR) standard for uplink and downlink control information, use cases such as enhanced mobile broadband (eMBB) and broadcast channel (BCH) highlight their practical relevance in modern wireless communication systems.


Code Rate Optimization via Neural Polar Decoders

Aharoni, Ziv, Huleihel, Bashar, Pfister, Henry D, Permuter, Haim H

arXiv.org Artificial Intelligence

This paper proposes a method to optimize communication code rates via the application of neural polar decoders (NPDs). Employing this approach enables simultaneous optimization of code rates over input distributions while providing a practical coding scheme within the framework of polar codes. The proposed approach is designed for scenarios where the channel model is unknown, treating the channel as a black box that produces output samples from input samples. We employ polar codes to achieve our objectives, using NPDs to estimate mutual information (MI) between the channel inputs and outputs, and optimize a parametric model of the input distribution. The methodology involves a two-phase process: a training phase and an inference phase. In the training phase, two steps are repeated interchangeably. First, the estimation step estimates the MI of the channel inputs and outputs via NPDs. Second, the improvement step optimizes the input distribution parameters to maximize the MI estimate obtained by the NPDs. In the inference phase, the optimized model is used to construct polar codes. This involves incorporating the Honda-Yamamoto (HY) scheme to accommodate the optimized input distributions and list decoding to enhance decoding performance. Experimental results on memoryless and finite-state channels (FSCs) demonstrate the effectiveness of our approach, particularly in cases where the channel's capacity-achieving input distribution is non-uniform. For these cases, we show significant improvements in MI and bit error rates (BERs) over those achieved by uniform and independent and identically distributed (i.i.d.) input distributions, validating our method for block lengths up to 1024. This scalable approach has potential applications in real-world communication systems, bridging theoretical capacity estimation and practical coding performance.


Protect Before Generate: Error Correcting Codes within Discrete Deep Generative Models

Martínez-García, María, Villacrés, Grace, Mitchell, David, Olmos, Pablo M.

arXiv.org Artificial Intelligence

Despite significant advancements in deep probabilistic models, learning low-dimensional discrete latent representations remains a challenging task. In this paper, we introduce a novel method that enhances variational inference in discrete latent variable models by leveraging Error Correcting Codes (ECCs) to introduce redundancy in the latent representations. This redundancy is then exploited by the variational posterior to yield more accurate estimates, thereby narrowing the variational gap. Inspired by ECCs commonly used in digital communications and data storage, we demonstrate proof-of-concept using a Discrete Variational Autoencoder (DVAE) with binary latent variables and block repetition codes. We further extend this idea to a hierarchical structure based on polar codes, where certain latent bits are more robustly protected. Our method improves generation quality, data reconstruction, and uncertainty calibration compared to the uncoded DVAE, even when trained with tighter bounds such as the Importance Weighted Autoencoder (IWAE) objective. In particular, we demonstrate superior performance on MNIST, FMNIST, CIFAR10, and Tiny ImageNet datasets. The general approach of integrating ECCs into variational inference is compatible with existing techniques to boost variational inference, such as importance sampling or Hamiltonian Monte Carlo. We also outline the key properties ECCs must have to effectively enhance discrete variational inference.


JaxLife: An Open-Ended Agentic Simulator

Lu, Chris, Beukman, Michael, Matthews, Michael, Foerster, Jakob

arXiv.org Artificial Intelligence

Human intelligence emerged through the process of natural selection and evolution on Earth. We investigate what it would take to re-create this process in silico. While past work has often focused on low-level processes (such as simulating physics or chemistry), we instead take a more targeted approach, aiming to evolve agents that can accumulate open-ended culture and technologies across generations. Towards this, we present JaxLife: an artificial life simulator in which embodied agents, parameterized by deep neural networks, must learn to survive in an expressive world containing programmable systems. First, we describe the environment and show that it can facilitate meaningful Turing-complete computation. We then analyze the evolved emergent agents' behavior, such as rudimentary communication protocols, agriculture, and tool use. Finally, we investigate how complexity scales with the amount of compute used. We believe JaxLife takes a step towards studying evolved behavior in more open-ended simulations. Our code is available at https://github.com/luchris429/JaxLife


Deep Polar Codes

Choi, Geon, Lee, Namyoon

arXiv.org Artificial Intelligence

In this paper, we introduce a novel class of pre-transformed polar codes, termed as deep polar codes. We first present a deep polar encoder that harnesses a series of multi-layered polar transformations with varying sizes. Our approach to encoding enables a low-complexity implementation while significantly enhancing the weight distribution of the code. Moreover, our encoding method offers flexibility in rate-profiling, embracing a wide range of code rates and blocklengths. Next, we put forth a low-complexity decoding algorithm called successive cancellation list with backpropagation parity checks (SCL-BPC). This decoding algorithm leverages the parity check equations in the reverse process of the multi-layered pre-transformed encoding for SCL decoding. Additionally, we present a low-latency decoding algorithm that employs parallel-SCL decoding by treating partially pre-transformed bit patterns as additional frozen bits. Through simulations, we demonstrate that deep polar codes outperform existing pre-transformed polar codes in terms of block error rates across various code rates under short block lengths, while maintaining low encoding and decoding complexity. Furthermore, we show that concatenating deep polar codes with cyclic-redundancy-check codes can achieve the meta-converse bound of the finite block length capacity within 0.4 dB in some instances.


Deepcode and Modulo-SK are Designed for Different Settings

Kim, Hyeji, Jiang, Yihan, Kannan, Sreeram, Oh, Sewoong, Viswanath, Pramod

arXiv.org Machine Learning

We respond to [1] which claimed that "Modulo-SK scheme outperforms Deepcode [2]". We demonstrate that this statement is not true: the two schemes are designed and evaluated for entirely different settings. DeepCode is designed and evaluated for the AWGN channel with (potentially delayed) uncoded output feedback. Modulo-SK is evaluated on the AWGN channel with coded feedback and unit delay. [1] also claimed an implementation of Schalkwijk and Kailath (SK) [3] which was numerically stable for any number of information bits and iterations. However, we observe that while their implementation does marginally improve over ours, it also suffers from a fundamental issue with precision. Finally, we show that Deepcode dominates the optimized performance of SK, over a natural choice of parameterizations when the feedback is noisy.


Simple Modulo can Significantly Outperform Deep Learning-based Deepcode

Ben-Yishai, Assaf, Shayevitz, Ofer

arXiv.org Machine Learning

Deepcode (H.Kim et al.2018) is a recently suggested Deep Learning-based scheme for communication over the AWGN channel with noisy feedback, claimed to be superior to all previous schemes in the literature. Deepcode's use of nonlinear coding (via Deep Learning) has been inspired by known shortcomings (Y.-H. Kim et al 2007) of linear feedback schemes. In 2014, we presented a nonlinear feedback coding scheme based on a combination of the classical SK scheme and modulo-arithmetic, using a small number of elementary operations without any type of neural network. This Modulo-SK scheme has been omitted from the performance comparisons made in the Deepcode paper, due to its use of common randomness (dither), and in a later version since it was incorrectly interpreted as a variable-length coding scheme. However, the dither in Modulo-SK was used only for the standard purpose of tractable performance analysis, and is not required in practice. In this short note, we show that a fully-deterministic Modulo-SK (without dithering) can outperform Deepcode. For example, to attain an error probability of 10^(-4) at rate 1/3 Modulo-SK requires 3dB less feedback SNR than Deepcode. To attain an error probability of 10^(-6) with noiseless feedback, Deepcode requires 150 rounds of communication, whereas Modulo-SK requires only 15 rounds, even if the feedback is noisy (with 27dB SNR). We further address the numerical stability issues of the original SK scheme reported in the Deepcode paper, and explain how they can be avoided. We augment this report with an online-available, fully-functional Matlab simulation for both the classical and Modulo-SK schemes. Finally, note that Modulo-SK is by no means claimed to be the best possible solution; in particular, using deep learning in conjunction with modulo-arithmetic might lead to better designs, and remains a fascinating direction for future research.


Deepcode: Feedback Codes via Deep Learning

Kim, Hyeji, Jiang, Yihan, Kannan, Sreeram, Oh, Sewoong, Viswanath, Pramod

Neural Information Processing Systems

The design of codes for communicating reliably over a statistically well defined channel is an important endeavor involving deep mathematical research and wide- ranging practical applications. In this work, we present the first family of codes obtained via deep learning, which significantly beats state-of-the-art codes designed over several decades of research. The communication channel under consideration is the Gaussian noise channel with feedback, whose study was initiated by Shannon; feedback is known theoretically to improve reliability of communication, but no practical codes that do so have ever been successfully constructed. We break this logjam by integrating information theoretic insights harmoniously with recurrent-neural-network based encoders and decoders to create novel codes that outperform known codes by 3 orders of magnitude in reliability. We also demonstrate several desirable properties in the codes: (a) generalization to larger block lengths; (b) composability with known codes; (c) adaptation to practical constraints. This result also presents broader ramifications to coding theory: even when the channel has a clear mathematical model, deep learning methodologies, when combined with channel specific information-theoretic insights, can potentially beat state-of-the-art codes, constructed over decades of mathematical research.


Deepcode: Feedback Codes via Deep Learning

Kim, Hyeji, Jiang, Yihan, Kannan, Sreeram, Oh, Sewoong, Viswanath, Pramod

Neural Information Processing Systems

The design of codes for communicating reliably over a statistically well defined channel is an important endeavor involving deep mathematical research and wide- ranging practical applications. In this work, we present the first family of codes obtained via deep learning, which significantly beats state-of-the-art codes designed over several decades of research. The communication channel under consideration is the Gaussian noise channel with feedback, whose study was initiated by Shannon; feedback is known theoretically to improve reliability of communication, but no practical codes that do so have ever been successfully constructed. We break this logjam by integrating information theoretic insights harmoniously with recurrent-neural-network based encoders and decoders to create novel codes that outperform known codes by 3 orders of magnitude in reliability. We also demonstrate several desirable properties in the codes: (a) generalization to larger block lengths; (b) composability with known codes; (c) adaptation to practical constraints. This result also presents broader ramifications to coding theory: even when the channel has a clear mathematical model, deep learning methodologies, when combined with channel specific information-theoretic insights, can potentially beat state-of-the-art codes, constructed over decades of mathematical research.